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Abstract 

The profusion of spectral bands generated by the acquisition process of hyperspectral 
images generally leads to high computational costs. Such difficulties arise in particular with 
nonlinear unmixing methods, which are naturally more complex than linear ones. This 
complexity, associated with the high redundancy of information within the complete set of 
bands, make the search of band selection algorithms relevant. With this work, we propose 
a band selection strategy in reproducing kernel Hilbert spaces that allows to drastically 
reduce the processing time required by nonlinear unmixing techniques. Simulation results 
show a complexity reduction of two orders of magnitude without compromising unmixing 
performance. 

Index terms — Hyperspectral data, nonlinear unmixing, band selection, kernel methods 

1 Introduction 

Hyperspectral images (HI) consist of hundreds or even thousands of contiguous spectral samples 
ranging from the visible to the near infrared portions of the light spectrum. His trade spacial for 
spectral resolution [^, a consequence of which is that observed HI pixels can be mixtures of the 
spectral signatures of the materials present in the scene. The spectral signature of each material 
is usually available as a vector whose elements are proportional to the reflectances associated 
with that material at each frequency band. Such vectors are typically called endmembers due 
to their geometrical interpretation in the linear mixing case. The analysis of His frequently 
aims at unmixing the spectral information present at each pixel in the image, a study known as 
hyperspectral unmixing (HU). Unmixing problems can be cast as supervised or unsupervised 
learning problems depending on whether the endmembers are known or not. 

Several models have been proposed in the literature to describe the mixing process of spectral 
information in His. The simplest one is the linear mixing model (LMM), in which each observed 
pixel spectrum is modeled as a linear combination of the endmembers [^. Though the LMM 
simplifies the mathematical treatment of the unmixing problem, it has been recognized that 
significant nonlinear effects are often present in the spectral mixing occurring in real images . 
Nonlinear mixing occurs, for instance, due to multiple interactions among light and different 
materials during the acquisition process. Recognition of such nonlinear effects has led to several 
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5 and 141094/2012-5, and by the Agence Nationale pour la Recherche, France, (Hypanema project, ANR-12- 
BS03-003), and by ANR-11-LABX-0040-CIMI within the program ANR-11-ID EX-0002-02. 
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nonlinear mixing models for HI processing. These models include adding cross-terms of different 
endmembers to the LMM , using bilinear mixture models , post-nonlinear mixing models 
[^, and kernel-based models [^. Most nonlinear unmixing techniques are based on Bayesian 
inference [^, on using manifold learning techniques and geodesic distances [9 -11 , or processing 
data in reproducing kernel Hilbert spaces (RKHS) [8,12 


One of the problems in practical implementation of unmixing algorithms is the profusion of 
spectral bands generated in the acquisition process, which leads to high computational costs. 
This is especially true for nonlinear unmixing algorithms, which are naturally more complex 
than linear techniques. Such inherent complexity, associated with the high redundancy within 
the complete set of bands, make the search of band selection techniques natural and relevant [^ . 
Several band selection algorithms have been proposed for linearly mixed His, which generally 
requires solving an optimization problem [14] . Nonlinear unmixing presents an even more 
challenging problem for band selection. 

In this paper, we propose a technique for band selection in nonlinear supervised HI unmixing 
problems. This method applies a kernel k-means algorithm to identify nonlinearly separable 
clusters of spectral bands in the corresponding RKHS. The solution is evaluated using the SK- 
Hype nonlinear unmixing algorithm [^ on the selected bands. Simulation results indicate a 
complexity reduction of two orders of magnitude without compromising unmixing performance. 

The paper is organized as follows. First, we state the unmixing problem and introduce 
usual nonlinear mixing models. Then, we describe a nonlinear unmixing algorithm operating in 
RKHS. Next, we introduce our band selection algorithm based on kernel k-means. We provide 
promising simulation results to illustrate the performance of our approach. Finally, we present 
some concluding remarks. 


2 Mixture Models 

Each observed HI pixel can be written as a function of the endmembers and a noise component 
that covers the unknown or unmodeled factors in the system: 

r = ^(AT) -I- n (1) 

with r = [n,... ,rL]^ the observed pixel vector over L spectral bands, M = [mi ,..., m/j] the 
Lx R matrix of endmembers ruj, n a noise vector, and an unknown function. Several mixing 
models have been presented in the literature, depending on the linearity or nonlinearity of T', 
type of mixture, scale, and other properties [^. 

2.1 The Linear Mixing Model (LMM) 

The LMM assumes that is a convex combination of the endmembers, that is, 

r = Mcx + n 

T (2) 

subject to 1 Q; = 1, and o: ^ 0 

where the vector a = [ai ,..., contains the proportions (or abundances) of each endmem- 
ber in M and, therefore, cannot be negative and should sum to one. The observation at the 
Kth spectral band in Q can be written as 

ri = + Ui (3) 

where is the £-th row of M. For = 0 (noiseless case), the sum-to-one and positivity 
constraints on the abundances in Q confine the data (pixels) to a simplex, for which the vertices 
are the endmembers. 
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2.2 Nonlinear Mixing Models 

Different parametric nonlinear models have been proposed in the literature (see, e.g., and 
references therein). Here we review two popular models that will be used later on to generate 
synthetic data for evaluation purposes. 

The generalized bilinear model (GBM) is given by 

fl-i R 

r = Mol + E E dijOiOjirii © rrij + n 

i=i j=i+i (^) 

subject to ot = 1, and o: © 0 

where the parameters 5ij G [0,1] govern the amount of nonlinear contribution, and 0 denotes 
the Hadamard product. For simplicity, we consider in the following a simplihed version of this 
model, with a single parameter 5 to control the nonlinear contribution such that 6ij = 6 for all 
(bj)- 

The post nonlinear mixing model (PNMM) [15| is 

r = g{Mci)+n (5) 


where gf(-) is a nonlinear function applied to the result of a linear mixing. The PNMM can 
represent a wide range of nonlinear mixing models, depending on g{-). For instance, the PNMM 
considered in is defined as 

r = (Mol)^ + n (6) 


where denotes the exponential value ^ applied to each entry of the input vector v. The 
PNMM was explored in other works using different forms for g{-) applied to HU [6,16|T7|. 

The GBM and the PNMM Q nonlinear mixing models consider mainly the scattering 
effect in light interactions with endmembers. Other models consider different nonlinear effects, 
depending on the characteristics of the application, such as the type of mixture, the types of 
materials, the geometry of the reflection surface, and the constraints over the abundances 18- 
22 . More importantly, these informations are usually missing in HU problems. Therefore, it 


makes sense to consider nonlinear unmixing strategies that do not make strong assumptions 
about the nonlinearity in the mixture. 


3 Nonlinear unmixing strategy 


This section reviews the SK-Hype algorithrcQ for nonlinear unmixing of His . It considers 
the mixing model consisting of a linear trend parametrized by the abundance vector a and a 
nonlinear residual component if:. This model is given by 

r£ = uoL^mx^ + {I - u)ip{mxf)+ n£ (7) 


where u € [0,1] controls the amount of linear contribution to the model and is an unknown 
function in an RKHS H. 

Kernel methods are efficient machine learning techniques that were initially introduced for 
solving nonlinear classification and regression problems. They consist of linear algorithms op¬ 
erating in high dimensional feature spaces into which the data have been mapped using kernel 
functions 23 . These approaches are based on the framework of reproducing kernels which states 
that, for any positive kernel K{mXi,mXj), there exists a Hilbert space Td with inner product 
{■ ,-)'H and a mapping 


$ : 




H 


^Matlab code available at www.cedric-richard.fr 


( 8 ) 

(9) 
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such that K{mx.,mx-) = {^{mx^), ^{mx.))'H- This last property allows to implicitly compute 
inner products in Ti by evaluating a real function, K{mx^,mx ), in the input space. Other 
useful properties are 'ijj{mxj) = ('ll), K{-,mXj))n for all V' fo and the reproducing property 
K{mx„mx^) = {K{-,mxJ,K{-,mx^))n- 

The optimization problem proposed in for estimating the unknown variables ot, and 
u in Q is 


■ 1 „2 

mm - — a + 


It,ill,u 2 \u 


1 — u 


1 


^ £=1 


( 10 ) 


subject to Q: ^ 0 and ^ct = 1. This convex problem can be solved using a two stage alternating 
iterative optimization procedure with respect to (o:, ijj) and u. 


3.1 Solving with respect to {cx,4’) 


Introducing the Lagrange multipliers (3 and 7 , the dual problem of (10) for fixed u is given 
by | 8 ] 


max G{u, (3, 7 ) = 

/ 3,7 


1 

2 



Ku + 

uM \ 


uM' 

ul ) 

V 7 


+ 




subject to 7^0 


( 11 ) 


with Ku = uMM~^ + (1 — u)K, and K the Gram matrix such that Kij = K{mxi,mxj). This 
leads to the following solution of primal problem: 

( rt* = 

< 'll;* = {l-u)Y,e=i f^£ !<-{■, 

. e* = ///?; 

where (3* and 7 * are the solutions of 


3.2 Solving with respect to u 


Using the stationary conditions (12), the optimum value for u given /3* and 7 * can be computed 
each iteration as 24 

u* = {l + {l-u* ) / ^ ( 13 ) 

where u^i is the optimum u* obtained at the previous iteration. 


4 Band selection and unmixing 

Band selection is the strategy of choice for reducing the complexity of HI processing when the 
original spectral information of each pixel needs to be preserved [25] . Most existing band selec¬ 
tion algorithms have been derived assuming linearly mixed His. To the best of our knowledge, 
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the problem of band selection for a single nonlinearly mixed pixel and with the objective of 
reducing the complexity of the unmixing process is still largely untreated in the literature. As 
each pixel may be better characterized by a different set of bands, we propose a simple super¬ 
vised band selection approach to be applied to individual pixels so that only those endmembers 
present in each pixel, that is, the matrix M for that pixel, are considered. The clustering 
technique proposed in the following is based on the kernel k-means (KKM) algorithm [26j , and 
uses the fact that each band of the HI is a function of the elements in one row of matrix M. 
The choice of the KKM algorithm is due to the practical assumption that we lack information 
about the nonlinearity associated to the endmember mixing in each HI. 


4.1 Kernel k-means for band-selection 


For band selection we consider each row of M as an element of a vector space, and search 
for a set of K disjoint clusters Ci,... ,Ck in that space. Then, a unique wavelength is chosen 
using the KKM algorithm to represent each cluster. KKM is an extension of the standard 
k-means clustering algorithm that identifies nonlinearly separable clusters 26 . It maps the 


data implicitly to a RKHS % where it performs a conventional k-means algorithm. Since the 
computation of the centroids in space % is intractable, KKM algorithms use the reproducing 
property to directly compute squared distances between points in a cluster Ck- Therefore, 
given a cluster Ck enclosing points {k(-, we can write the centroid ^k as 


f^k — 




Nk ^ 


(14) 


ieCfc 


where Nk is the number of points in Ck- The squared distances to the centroid of Ck are then 
computed as 


\K{-,mxi) - HkWn = K{mxg,mxf) 
1 


Nk ^ 


^ K{mxf,mxi 


ieCk 


(15) 


1 


^ ieCkjeCk 


and the cluster error is defined as 


K 


K(/ii,... ,/iig) = EE \\K{-,mx^ 

k=i £eCfe 


,, l|2 

k-kWn- 


(16) 


To preserve the original HI band information, we propose to represent cluster Ck by the 
band £k corresponding to the closest point to the centroid fik- 


= argmin||K(-,mAj 

£eCfe 


(17) 


The global kernel k-means (GKKM) algorithm uses the principles above for incremental 
clustering [^. GKKM avoids poor local minima and produces near-optimal solutions that are 
robust to cluster initialization. A fast GKKM (FGKKM) version that performs a unique KKM 
run and greatly reduces the complexity of the algorithm can also be used. Algorithm details 
the application of FGKKM algorithm using ( |17[ ) for band selection. We refer to Algorithm as 
KKMBS for short. 
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Algorithm 1: FGKKM Band Selection (KKMBS) 

Input : The L x R matrix M and the desired number of bands Ni,. 
Output: Selected band indices £. 

1 % Find clusters 

2 [Ci,...,CArJ =FGKKM(M,iVfe); 

3 % Find the lines of M closer to the centroids in Ci,... 

4 for k = 1 to Nb do 

5 I 4 = argmin^gcJI«(-)"^Aj - 

6 end 

7 return £ 


4.2 Unmixing of rednced data 


For a given A-pixel HI region with a known endmember matrix iW, define R = [ri,..., utv] as 
the matrix of observations. The unmixing problem then reduces to an inversion step for which 
we solve the optimization problem described in Section to obtain the estimated abundances 
given by A = [q^, ..., a^]. We propose to replace matrices R and M by their reduced versions 
Rr and containing only N^, selected bands obtained with Algorithmic Then, the SK-Hype 
algorithm can be used to perform data unmixing and to estimate the abundance matrix as 
shown in Figure [C 


R 


KKMBS 


Mr- 


Rr 


SK-Hype 


Figure 1: Band selection and unmixing. 


5 Simulation results 


This section presents simulation results with synthetic data to illustrate the potential of the band 
selection technique. The HI was built using measured spectra from eight materials: alunite, 
calcite, epidote, kaolinite, buddingtonite, almandine, jarosite and lepidolite. Their spectra were 
selected from the spectral library of the ENVI software and consisted of 420 contiguous bands, 
covering wavelengths from 0.3951 to 2.56 micrometers. We compared the results obtained with 
2 nonlinear mixing models using 5 unmixing algorithm implementations and 2 performance 
measures: the root mean square error (RMSE) and the relative execution time (RET). 

RMSE = ||A -A\\f/{NR) (18) 


where ||A|4 denotes the Frobenius norm of matrix X. The RET for the p-th algorithm was 
determined as RETp = 4 /^fclS) where tp is the execution time for the p-th algorithm and 
tpcLS is the execution time for the Fully Gonstrained Least Squares (FCLS) algorithm. 

The data were nonlinearly mixed using the PNMM and GBM models presented in Sec¬ 
tion 2.2, For PNMM, we used ^ = 0.7, and for GBM, we considered 5 = 1. We generated 6 


databases using the two models and different number of endmembers (5 and 8 ). Each database 
had 2000 pixels generated using random abundances drawn uniformly from the simplex defined 
by ((C). The additive noise power was chosen to produce a 21dB SNR. Eor each dataset we 
unmixed the data using FCLS, SK-Hype without band selection and the proposed algorithm 
for Nh = [10,100, 300]. For both SK-Hype and KKMBS, we used the Gaussian kernel 28 with 
bandwidth cr| = 0.3. Table a presents the RMSEs and the RETs obtained for each simulation. 
The RETs within parentheses include the time for both band separation and unmixing as in¬ 
dicated. These results show comparable RMSEs for all nonlinear unmixing results and very 
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significant improvements in RET when using the proposed band selection approach (about 145 
times for Nb = 10). 

Table shows the results when the nonlinearity parameters (<5 for GBM and ^ for PNMM) 
were set differently for each band. Parameter 6 (^) varied in the interval [0.5,1] ([0.5, 0.9]) with 
steps of 0.05 (0.04), changing at every 42 bands. These results corroborate the results shown 
in Table We have also compared the results of KKMBS with those obtained after randomly 
selecting the bands. Figure shows the histograms of the RMSE when selecting 10 and 100 
bands. It is clear that the KKMBS leads to significantly better results when few bands are 
selected. 


Table 1: RMSE and RET for SNR = 21dB and random abundances. 



PNMM 


5 endmembers 

8 endmembers 

Alg. (M) 

RMSE 

RET (BS-tHU) 

RMSE 

RET (BS-tHU) 

FCLS 

0.1893 

1 

0.1243 

1 

SK-Hype 

0.1136 

2690.6 

0.0762 

3028.8 

SK-Hype(lO) 

0.1114 

16.0 (18.1) 

0.0775 

16.8 (18.7) 

SK-Hype(lOO) 

0.1150 

107.7 (129.1) 

0.0766 

118.6 (144.8) 

SK-Hype(300) 

0.1139 

1226.7 (1327.1) 

0.0763 

1331.5 (1452.7) 


GBM 

FCLS 

0.2419 

1 

0.1836 

1 

SK-Hype 

0.1080 

3320.7 

0.0738 

3072.6 

SK-Hype(lO) 

0.103T 

24.1 (26.9) 

0.0712 

18.7 (21.1) 

SK-Hype(lOO) 

0.1095 

157.0 (194.6) 

0.0741 

119.7 (148.1) 

SK-Hype(300) 

0.1083 

1548.6 (1676.0) 

0.0738 

1475.2 (1597.7) 


Table 2: RMSE and RET for different nonlinearities in each band. 



PNMM 

5 Endmembers 

8 Endmembers 

Alg. {N,) 

RMSE 

RET (BS-tHU) 

RMSE 

RET (BS-tHU) 

FCLS 

0.1966 

1 

0.1521 

1 

SK-Hype 

0.1127 

3744.9 

0.0765 

3672.2 

SK-Hype(lO) 

0.1131 

21.7 (24.3) 

0.0827 

21.5 (24.4) 

SK-Hype(lOO) 

0.1140 

153.4 (184.2) 

0.0771 

140.3 (170.8) 

SK-Hype(300) 

0.1129 

1764.3 (1907.5) 

0.0765 

1601.0 (1743.7) 


GBM 


5 Endmembers 

8 Endmembers 

FCLS 

0.0527 

1 

0.0390 

1 

SK-Hype 

0.1090 

3432.5 

0.0745 

3825.1 

SK-Hype(lO) 

0.1053 

20.8 (23.6) 

0.0730 

20.9 (23.0) 

SK-Hype(lOO) 

0.1105 

142.2 (171.5) 

0.0748 

142.9 (173.8) 

SK-Hype(300) 

0.1093 

1612.8 (1735.2) 

0.0745 

1637.8 (1778.1) 


6 Conclusions 

This work proposed a supervised band selection strategy to reduce the complexity of nonlinear 
hyperspectral data unmixing without compromising the accuracy of abundance estimation. 
Significant reduction in processing time was achieved in all cases tested. These results suggest 
the possibility of important complexity reduction for nonlinear HI processing algorithms without 
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(a) 10 bands. 


(b) 100 bands. 


Figure 2: RMSE for 10 and 100 bands. 


performance loss. It is conjectured that presented performance can be further improved by 
removing redundancy in the data through a more specialized band selection algorithm. This is 
the topic of a work in progress. 
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